Summary. The funnel plot is a graphical visualisation of summary data estimates from a meta-analysis, and 
is a useful tool for detecting departures from the standard modelling assumptions. Although perhaps not 
widely appreciated, a simple extension of the funnel plot can help to facilitate an intuitive interpretation of 
the mathematics underlying a meta-analysis at a more fundamental level, by equating it to determining the 
centre of mass of a physical system. We used this analogy, with some success, to explain the concepts of 
weighing evidence and of biased evidence to a young audience at the Cambridge Science Festival, without 
recourse to precise definitions or statistical formulae. In this paper we aim to formalise this analogy at a more 
technical level using the estimating equation framework: firstly, to help elucidate some of the basic statistical 
models employed in a meta-analysis and secondly, to forge new connections between bias adjustment in the 
evidence synthesis and causal inference literatures. 

Keywords: Meta-analysis, Funnel plot, Causal inference, Estimating equations, Egger regression, 
Mendelian randomization. 

1. Introduction 

Let $y_i$, $i = 1,\ldots,k$, represent summary estimates of the same apparent quantity from $k$ independent 
information sources in a meta-analysis. The $i$th estimate is associated with a fixed and known variance, 
$s_i^2$. The standard fixed effect model assumes 

$$y_i = \mu + s_i \epsilon_i, \qquad \epsilon_i \sim N(0,1). \qquad (1)$$

The focus for inference, in terms of point estimates and confidence intervals, is the population mean effect 
parameter $\mu$. The fixed effect estimate $\hat{\mu}$ and its variance are given by the well-known formula 

$$\hat{\mu} = \frac{\sum_{i=1}^{k} w_i y_i}{\sum_{i=1}^{k} w_i}, \qquad \mathrm{Var}(\hat{\mu}) = 1 \Big/ \sum_{i=1}^{k} w_i. \qquad (2)$$
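As a minimal sketch of formula (2), the following Python snippet computes the fixed effect estimate and its variance. It takes $w_i = 1/s_i^2$, the fixed effect weight used in Section 1.1; the function name `fixed_effect` and the three data points are hypothetical and chosen only for illustration.

```python
import math


def fixed_effect(y, s):
    """Return the fixed effect estimate and its variance, as in formula (2)."""
    w = [1.0 / si ** 2 for si in s]  # inverse-variance weights w_i = 1/s_i^2
    total = sum(w)
    mu_hat = sum(wi * yi for wi, yi in zip(w, y)) / total
    return mu_hat, 1.0 / total


# Hypothetical summary data: three estimates and their standard errors.
y = [0.2, -0.1, 0.4]
s = [0.1, 0.2, 0.4]
mu_hat, var_mu = fixed_effect(y, s)
half_width = 1.96 * math.sqrt(var_mu)  # normal-theory 95% interval
print(f"estimate {mu_hat:.4f}, "
      f"95% CI ({mu_hat - half_width:.4f}, {mu_hat + half_width:.4f})")
```

The most precise study (smallest $s_i$) dominates the estimate, which is the behaviour the physical analogy later in this section is designed to convey.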
Meta-analysis has a long history in medical research, where the information source has traditionally been 
a randomized clinical trial, comparing an experimental treatment against standard therapy. However, its 
use spans the entire scientific spectrum, for example in the areas of psychology (Hedges, 1992); 
inter-laboratory experiments (Paule and Mandel, 1982); particle physics (Baker and Jackson, 2013); Mendelian 
randomization (Burgess and Thompson, 2010) and many others. 

Regardless of the application area, if $s_i$ and $\epsilon_i$ are mutually independent, then equation (2) will give 
a consistent estimate for $\mu$. In order to provide a convenient shorthand that easily generalizes to more 
complex models, we write $\perp\!\!\!\perp(s_i, \epsilon_i)$ to denote this assumption. 


1.1. The funnel plot and the fixed effect model 

The funnel plot (Sterne and Egger, 2001) is a graphical visualisation of data from a meta-analysis. It 
plots study $i$'s effect size on the x-axis versus a measure of precision on the y-axis (usually $1/s_i$). The 
independence assumption $\perp\!\!\!\perp(s_i, \epsilon_i)$ of model (1) implies that there should be no correlation between effect 
size and precision. If this is the case, then the plot should appear symmetrical, in that the results from 
smaller studies should funnel in towards those from larger studies. It therefore provides a simple means 
to visually assess the plausibility of $\perp\!\!\!\perp(s_i, \epsilon_i)$. 

Although perhaps not as widely appreciated, a simple extension of the funnel plot can help to facilitate 
an intuitive interpretation of the mathematics underlying a meta-analysis at a more fundamental 
level, by equating it to determining the centre of mass of a physical system. This analogy was recently 
exploited by MRC scientists to build a machine (named the ‘Meta-Analyzer’) to explain the concepts of 
weighing evidence and of biased evidence to a young audience at the Cambridge Science Festival, without 
recourse to precise definitions or statistical formulae. We have since produced a web-emulation of the 
machine to bring these ideas to an even wider audience of students and researchers. It can be found at 
https://chjackson.shinyapps.io/MetaAnalyser/. 

Figure 1 shows the Meta-Analyzer web app populated with a fictional meta-analysis of $k=13$ studies. 
In it the standard funnel plot has been augmented so that the area representing point $i$ is proportional 
to study $i$'s fixed effect weight $w_i = 1/s_i^2$, in order to promote its interpretation as a physical mass. It is 
proportional because, for study $i$, we show its weight in the Meta-Analyzer as a percentage of the total 
weight in the analysis, which equals $100 \times w_i / \sum_{j=1}^{k} w_j$. 

Weighing evidence with the Meta-Analyzer 

Fig. 1: The Meta-Analyzer supporting a fictional body of 13 study results. The centre of mass (overall 
estimate) is located at zero. 

Points are joined by horizontal cord to a pole that is itself joined to a vertical stand at a pivot point, $p$ 
say. Study weights are imagined to exert a downward force due to gravity. Since the pole is perfectly 
horizontal, it is intuitively understood that $p$ satisfies the physical law: 

$$\sum_{i=1}^{k} w_i (y_i - p) = 0,$$

and is therefore equal to the centre of mass (Beatty, 2005). The above formula can be viewed as a 
rudimentary estimating equation, a construction we continue to utilize throughout this paper. It is simple 
to verify that $p$ is identical to the fixed effect estimate $\hat{\mu}$ in (2). The length of the stand base along the 
x-axis shows the 95% confidence interval for $\mu$. Again, there is a nice physical analogy to draw. When 
there is a lot of uncertainty as to the overall effect estimate, so that $\mathrm{Var}(\hat{\mu})$ is large, the stand length 
must be wide in order to properly account for its instability. Conversely, when the overall effect estimate 
is very precise, so that $\mathrm{Var}(\hat{\mu})$ is small, the stand length need only be short. When populated with such 
idealised data as in Figure 1, the Meta-Analyzer may remind some of Galton's bean board or quincunx, 
a tool that also exploits gravity to illustrate how statistical laws can emerge from a seemingly random 
physical process. 
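The balance condition can be checked numerically. The sketch below, under the assumption of a small hypothetical symmetric data set, finds the pivot $p$ at which the net "torque" $\sum_i w_i (y_i - p)$ vanishes by bisection, and compares it with the fixed effect estimate. The function name `balance_point` and the data are illustrative only and are not part of the Meta-Analyzer software.

```python
import math


def balance_point(y, w, iterations=200):
    """Find p with sum_i w_i * (y_i - p) = 0 by bisection on the net torque."""
    def torque(p):
        return sum(wi * (yi - p) for wi, yi in zip(w, y))

    lo, hi = min(y), max(y)  # the centre of mass lies between the extremes
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if torque(mid) > 0:  # torque decreases in p, so the pivot is to the right
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical symmetric funnel: effect sizes and standard errors.
y = [-1.0, -0.5, 0.0, 0.5, 1.0]
s = [0.5, 0.3, 0.1, 0.3, 0.5]
w = [1.0 / si ** 2 for si in s]

p = balance_point(y, w)
fixed_effect_estimate = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
stand_half_length = 1.96 / math.sqrt(sum(w))  # half-width of the 95% interval
print(p, fixed_effect_estimate, stand_half_length)
```

Bisection is used here only to mirror the physical act of sliding the pivot until the pole is level; the closed form in (2) gives the same answer directly.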

Although the Meta-Analyzer was intended to be a simple educational tool to demonstrate the basics 
of evidence synthesis, we have continued to find the physical system it describes useful more broadly, to 
easily explain some common secondary issues in meta-analysis and to help make new connections between 
statistical techniques for bias adjustment from the literature that seem, at first sight, unrelated. In Section 
2 we show that it helps to transparently demonstrate the implications of moving from a fixed to a random 
effects model and to assess the influence of outlying studies. In Section 3 we discuss the issue of biased 
evidence, how the Meta-Analyzer was used to explain this concept to a lay audience, and how small study 
bias is commonly addressed by medical statisticians using Egger regression (Egger et al., 1997). In Section 
4, we review the method of Mendelian randomization, a technique for estimating the causal effect of 
a modifiable exposure on a health outcome using observational data, by circumventing the problem of 
confounding. We draw parallels between adjusting for bias in Mendelian randomization and small study 
bias in meta-analysis, and describe an extension to the standard model assumed by Egger regression, first 
proposed in Bowden, Davey Smith and Burgess (2015) within the context of Mendelian randomization, 
that can (in theory) be applied to both fields. In Section 5 we show that, by viewing Egger regression 
from a causal inference perspective, a novel estimating equation interpretation of this method is found 
that can be intuitively visualized via the Meta-Analyzer. We illustrate this new interpretation on some 
real data examples and conclude with a discussion in Section 6. 

J Bowden et al. 

2. Random effects models 

A common issue in meta-analysis is accounting for between study heterogeneity. In a fixed effect 
meta-analysis all studies are assumed to provide an estimate of the same quantity, and the only difference 
between studies is in the precision of their respective estimates for this quantity. This is often thought to 
be an over-simplistic model, especially when flatly contradicted by Cochran's Q-statistic, 
$Q = \sum_{i=1}^{k} w_i (y_i - \hat{\mu})^2$, being substantially larger than its expected value of $k-1$ under model (1), and a random effects model 
is preferred. Two distinct approaches for incorporating heterogeneity have emerged. The first, and most 
popular, is via an additional additive random effect, as in model (3) below (e.g. DerSimonian and Laird 
(1986), Higgins and Thompson (2002)). The second is via the addition of a multiplicative scale factor, 
as in model (4) below (for example Thompson and Sharp (1999), Baker and Jackson (2013)). 

$$y_i = \mu + s_i \epsilon_i + \delta_i, \qquad \delta_i \sim N(0, \tau^2), \quad \epsilon_i \sim N(0,1). \qquad (3)$$

$$y_i = \mu + \phi^{1/2} s_i \epsilon_i, \qquad \epsilon_i \sim N(0,1). \qquad (4)$$

We first note that, regardless of whether model (3) or (4) holds in practice, under the independence 
assumption, consistent estimation of the overall mean parameter $\mu$ is achieved by fitting the 
fixed effect model (1). In practice, estimation of $\mu$ can follow by simple application of formula (2) except 
now the weight given to study $i$, $w_i$, changes to $1/(s_i^2 + \tau^2)$ and $1/(\phi s_i^2)$ under models (3) and (4) respectively. 

The point estimate for $\mu$ obtained by fitting model (1) is identical to that obtained from fitting model (4). 
This is because the common term $\phi$ simply cancels from the numerator and denominator in (2) and only 
the variance of the estimate is altered. However, both the point estimate and variance for $\mu$ change under 
the additive random effect model (3). Because its use is ubiquitous, especially in the field of medical 
research (Baker and Jackson, 2013), we focus only on the additive model for the remainder of this section. 

If, after fitting the fixed effect model (1), Cochran's Q statistic about $\hat{\mu}$ indicates additional heterogeneity 
($Q > k - 1$), then it is common practice to estimate $\tau^2$ using the procedure defined by DerSimonian and 
Laird (1986), to give $\hat{\tau}^2_{DL}$. This estimate conveniently provides a link between $Q$ and a popular measure 
of heterogeneity, $I^2$ (Higgins, 2002), as follows: 

$$\frac{Q - (k - 1)}{Q} = \frac{\hat{\tau}^2_{DL}}{\hat{\tau}^2_{DL} + s^2_{typ}} = I^2,$$

where $s^2_{typ}$ is referred to as the ‘typical’ within study variance. 
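The link between $Q$, the DerSimonian and Laird estimate and $I^2$ can be sketched in a few lines of Python. The moment formula for the DL estimate and the expression used for the typical within study variance are the standard textbook forms, which are not spelled out in this paper and are assumed here; the function name and data are hypothetical.

```python
def dersimonian_laird(y, s):
    """Cochran's Q, the DL estimate of tau^2, the typical variance, and I^2."""
    k = len(y)
    w = [1.0 / si ** 2 for si in s]  # fixed effect weights
    sw = sum(w)
    mu_fe = sum(wi * yi for wi, yi in zip(w, y)) / sw
    q = sum(wi * (yi - mu_fe) ** 2 for wi, yi in zip(w, y))
    c = sw - sum(wi ** 2 for wi in w) / sw  # scaling constant of the DL moment estimator
    tau2_dl = max(0.0, (q - (k - 1)) / c)   # truncated at zero
    s2_typ = (k - 1) / c                    # assumed form of the 'typical' variance
    i2 = tau2_dl / (tau2_dl + s2_typ)
    return q, tau2_dl, s2_typ, i2


# Hypothetical heterogeneous data.
y = [0.1, 0.5, -0.3, 0.8, 0.2]
s = [0.1, 0.15, 0.2, 0.25, 0.3]
q, tau2_dl, s2_typ, i2 = dersimonian_laird(y, s)
print(q, tau2_dl, s2_typ, i2)
```

When $Q > k - 1$ the returned `i2` agrees with $(Q - (k-1))/Q$, which is the identity displayed above.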

2.1. Random effects models via the Meta-Analyzer 

In order to provide a physical interpretation of the calculations underpinning a meta-analysis so that 
they can be implemented using the Meta-Analyser, we formulate a system of estimating equations (as a 
natural progression of the single estimating equation for fixed effect model (1)) to fit the random effects 
model to find $\mu$ and $\tau^2$ as below: 

$$\text{Weight equation:} \qquad w_i = 1/(s_i^2 + \tau^2) \qquad (5)$$

$$\text{Mean equation:} \qquad \sum_{i=1}^{k} w_i (y_i - \mu) = 0 \qquad (6)$$

$$\text{Heterogeneity equation:} \qquad \sum_{i=1}^{k} w_i (y_i - \mu)^2 - (k - 1) = 0. \qquad (7)$$


Formula (7) is referred to as the generalized Q statistic (Bowden et al., 2011) and, when solved in 
conjunction with (5) and (6), it returns an estimate for $\mu$ and the Paule-Mandel (PM) estimate for $\tau^2$ 
(Paule and Mandel, 1982), which we denote by $\hat{\tau}^2_{PM}$. As with $\hat{\tau}^2_{DL}$, the PM estimate is constrained to 
be non-negative, and it is known to provide a more reliable estimate for the between study heterogeneity than 
$\hat{\tau}^2_{DL}$ (Veroniki et al., 2015). Random effects model (3) has been promoted by the Cochrane collaboration 
(Higgins and Green, 2011) and formally justified as a basis for inference beyond the current meta-analysis 
to future studies and populations (Higgins, Thompson and Spiegelhalter, 2009). It reduces to the fixed 
effects model when $\tau^2$ is either fixed or estimated to be 0. This implies $\mu_i = \mu$ for all $i$. 
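A minimal sketch of how equations (5) to (7) might be solved jointly is given below. For any trial value of $\tau^2$ the mean equation has a closed-form solution, so the system reduces to a one-dimensional root search in $\tau^2$; bisection is used here because the generalized Q statistic decreases as $\tau^2$ grows. The function names and data are hypothetical, and this is not the code used by the Meta-Analyzer.

```python
def weighted_mean_and_q(y, s, tau2):
    """Solve the mean equation (6) for a given tau^2 and return (mu, generalized Q)."""
    w = [1.0 / (si ** 2 + tau2) for si in s]  # weight equation (5)
    mu = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - mu) ** 2 for wi, yi in zip(w, y))
    return mu, q


def paule_mandel(y, s, iterations=200):
    """Return (mu, tau2) solving (5)-(7), with tau2 truncated at zero."""
    k = len(y)
    mu, q = weighted_mean_and_q(y, s, 0.0)
    if q <= k - 1:  # no excess heterogeneity: fixed effect solution
        return mu, 0.0
    lo, hi = 0.0, 1.0
    while weighted_mean_and_q(y, s, hi)[1] > k - 1:  # bracket the root
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if weighted_mean_and_q(y, s, mid)[1] > k - 1:
            lo = mid
        else:
            hi = mid
    tau2 = 0.5 * (lo + hi)
    return weighted_mean_and_q(y, s, tau2)[0], tau2


# Hypothetical heterogeneous data.
y = [0.1, 0.5, -0.3, 0.8, 0.2]
s = [0.1, 0.15, 0.2, 0.25, 0.3]
mu_pm, tau2_pm = paule_mandel(y, s)
print(mu_pm, tau2_pm)
```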


Application of the random effects model, with additional variance component $\tau^2$, leads to study 
results being both down-weighted and more similarly weighted. Furthermore, the original weight given to 
large studies is reduced to a greater extent than those of smaller studies. This issue is fairly subtle and 
hard to comprehend, but the Meta-Analyzer provides a very simple visualization, by adjusting the mass 
of each weight. Figure 2 (left) shows the Meta-Analyzer populated with a data set of 8 randomized trial 
results, each assessing the use of magnesium to treat myocardial infarction. The effect measure, $y_i$, is the 
log-odds ratio of death between treatment and control groups for study $i$. These data were previously 
analyzed by Higgins and Spiegelhalter (2002b). 


Between trial heterogeneity is present for these data ($\hat{\tau}^2_{DL} = 0.095$, $I^2 = 27.6\%$). Under random 
effects model (3) $w_i$ is reduced from $1/s_i^2$ to $1/(s_i^2 + \hat{\tau}^2)$. We represent the weight ‘loss’ induced by moving 
from a fixed to an additive random effects model by ‘drilling out’ a square of side length $x_i$, in order to satisfy 
the Pythagorean identity $x_i^2 + 1/(s_i^2 + \hat{\tau}^2) = 1/s_i^2$, as illustrated in Figure 3. The centre of mass defined by 
the holed-out weights and calculated by the Meta-Analyzer is automatically consistent with the random 
effects estimate for $\mu$. In Figure 2 we plug in $\hat{\tau}^2_{DL}$ for $\tau^2$ for illustration. Full results obtained via the 
estimating equation approach (and so using the PM estimate for $\tau^2$) are shown in Table 1. 
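The size of the drilled-out square follows directly from the weights: the removed area is the fixed effect weight minus the random effects weight. The short sketch below computes the side length and the fraction of weight lost, using the value 0.095 quoted above for the magnesium data together with three hypothetical standard errors (the individual trial standard errors are not reproduced in this section).

```python
import math


def hole_side(s_i, tau2):
    """Side x_i of the square removed so that x_i^2 + 1/(s_i^2 + tau2) = 1/s_i^2."""
    return math.sqrt(1.0 / s_i ** 2 - 1.0 / (s_i ** 2 + tau2))


tau2 = 0.095                 # heterogeneity estimate quoted in the text
for s_i in (0.2, 0.5, 1.0):  # hypothetical standard errors, large study first
    x_i = hole_side(s_i, tau2)
    fraction_lost = x_i ** 2 * s_i ** 2  # removed area relative to 1/s_i^2
    print(f"s_i={s_i:.1f}  x_i={x_i:.3f}  fraction of weight lost={fraction_lost:.3f}")
```

The fraction lost simplifies to $\tau^2/(s_i^2 + \tau^2)$, so it is largest for the most precise studies.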



Fig. 2: The Meta-Analyzer supporting the magnesium data under a random effects model for all trials 
(left); and with the Shechter trial removed (right). 


The fact that large studies lose more of their relative weight than small studies under an additive 
random effects model is immediately apparent from Figure 2 (left). We note briefly that no holing out is 
necessary when the multiplicative random effect model (4) is used. This is because the constant factor $\phi$ 
does not alter the weight given to study $i$ as a proportion of the total weight in the analysis, whatever its 
value. However, when $\phi \neq 1$ the variance of $\hat{\mu}$ will differ from that of the fixed effect model and so the 
stand length (confidence interval) will subsequently change. 
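The cancellation can be written out explicitly. Substituting the model (4) weights $1/(\phi s_i^2)$ into formula (2) gives, as a short sketch of the algebra:

```latex
\hat{\mu}
  = \frac{\sum_{i=1}^{k} y_i/(\phi s_i^2)}{\sum_{i=1}^{k} 1/(\phi s_i^2)}
  = \frac{\sum_{i=1}^{k} y_i/s_i^2}{\sum_{i=1}^{k} 1/s_i^2},
\qquad
\mathrm{Var}(\hat{\mu})
  = \frac{1}{\sum_{i=1}^{k} 1/(\phi s_i^2)}
  = \frac{\phi}{\sum_{i=1}^{k} 1/s_i^2}.
```

The point estimate is therefore free of $\phi$, while the variance is the fixed effect variance scaled by $\phi$.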


2.2. Sensitivity to outliers 

The amount of heterogeneity estimated in a meta-analysis can depend heavily on extreme, and often 
small, study results (Bowden et al., 2011). It is therefore useful in some circumstances to perform a 
sensitivity analysis, in which an outlying study result is excluded. Figure 2 (right) shows the Meta-Analyzer 
supporting the magnesium data under random effects model (3) excluding the Shechter study (shown in 
grey). The solid black support stand in Figure 2 (right) shows the overall estimate and corresponding 95% 
confidence interval in this case. The grey support stand shows the original point estimate and confidence 
interval. In this example, exclusion of the outlying study removes a large proportion of the between trial 
heterogeneity (updated $\hat{\tau}^2_{DL} = 0.012$, $I^2 = 5.2\%$), making interpretation of the remaining trial data easier. 

Our web application facilitates easy transitions between various models like this as part of a sensitivity 
analysis. Users also see the Meta-Analyzer dynamically tip and re-balance in response to their latest 
analysis choice. 

Fig. 3: (a) Weight given to study $i$ in a fixed effect meta-analysis. (b) Weight given to study $i$ in an 
additive random effects meta-analysis. A square of side length $x_i$ is removed from weight $i$ in order to satisfy 
the Pythagorean formula. 


Model                    Parameter             Est                      S.E     t value   p-value 
All studies              $\hat{\mu}$           -0.516                   0.214   -2.408    0.047 
                         $\hat{\tau}^2_{PM}$   0.084                    -       -         - 
                         $\hat{\tau}^2_{DL}$   0.095 ($I^2$ = 27.6%)    -       -         - 
Shechter study removed   $\hat{\mu}$           -0.362                   0.219   -1.653    0.149 
                         $\hat{\tau}^2_{PM}$   0.008                    -       -         - 
                         $\hat{\tau}^2_{DL}$   0.012 ($I^2$ = 5.1%)     -       -         - 

Table 1: Meta-analysis of the Magnesium data under random effects model (3), with and without the 
Shechter trial. 


3. Small study bias 

3.1. The Aspirin data 

Figure 4 (left) shows the Meta-Analyser enacted on 63 randomized controlled trials reported by Edwards 
et al. (1998) that each investigated the benefit of oral Aspirin for pain relief. Study estimates $y_i$ represent 
the log-odds ratios for the proportion of patients in each arm who had at least a 50% reduction in pain. 
Between trial heterogeneity was present for these data ($\hat{\tau}^2_{DL} = 0.04$, $I^2 = 10\%$) and Figure 4 (left) reflects 
the weight given to each study by the Meta-Analyzer under the random effects model (3) using this value as 
the heterogeneity parameter estimate. Despite the apparent between trial heterogeneity, the conclusion 
of the random effects meta-analysis is that oral Aspirin is an effective treatment: the combined log-odds 
ratio estimate is 1.26 in favour of Aspirin, with a 95% confidence interval of (1.1, 1.41). Full results are shown 
in Table 2. 

Fig. 4: Meta-Analyzer supporting the Aspirin data under a random effects model. 


The hypothetical data shown in Figure 1 are perfectly symmetrical about their centre of mass, indicating 
that there is no correlation between effect size and precision across studies. However, there is a clear 
asymmetry present in the Aspirin data: smaller studies tend to show larger effect size estimates, whereas 
larger studies tend to report more modest results. For these data, $\mathrm{Cor}(y_i, 1/s_i) = -0.7$, which suggests 
that the independence assumption $\perp\!\!\!\perp(s_i, \epsilon_i)$ does not hold under the assumed model. The phenomenon of observing a negative correlation between 
study precision and effect size is often given the umbrella term ‘small study bias’ (Egger et al., 1997; 
Sterne et al., 2011; Rücker et al., 2011). 
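The correlation diagnostic quoted above is straightforward to compute from summary data. The sketch below uses hypothetical data in which smaller studies (larger $s_i$) report larger effects; it does not reproduce the Aspirin data, and the helper name `pearson` is illustrative.

```python
import math


def pearson(a, b):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    var_a = sum((ai - mean_a) ** 2 for ai in a)
    var_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical studies: effect sizes grow as the standard error grows.
s = [0.1, 0.2, 0.3, 0.4, 0.5]
y = [0.45, 0.55, 0.95, 0.90, 1.30]
precision = [1.0 / si for si in s]

r = pearson(y, precision)
print(f"Cor(y_i, 1/s_i) = {r:.2f}")  # negative when small studies show larger effects
```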


3.2. The causes and consequences of small study bias 

Small study bias could actually be caused by real differences between small and large studies. Small trials 
may employ a more intensive intervention and therefore generate a greater effect on disease outcomes than 
larger trials (Egger et al., 1997; Bowater and Escarela, 2013). Asymmetry could also be a simple artefact 
of the data. For example, point estimates are not strictly independent of their estimated variances when 
calculated from binary or count outcomes (Harbord et al., 2006; Peters et al., 2010). However, its cause 
could also be more sinister. Publication bias, or the file-drawer problem (Rosenthal, 1979), occurs 
when journals selectively publish study results that achieve a high level of statistical significance, and also 
induces asymmetry. 

Much attention has been focused on methods to adjust for small study bias assuming that it is caused 
by selective dissemination of research findings, a practice which unfortunately is prevalent in biomedical 
research (Dwan et al., 2008). A common strategy is to propose an underlying selection model that led to 
the generation of biased data. Many have followed Hedges (1984, 1992), Copas (1999) and Copas and Shi 
(2000) in assuming the probability of observing a study result is some function of its precision. Whatever 
its true cause, when small study bias is present it can severely and adversely affect the conclusions reached. 
For example, it can lead one to detect apparent between trial heterogeneity when, in truth, none exists, 
and it can induce substantial bias into the overall estimate, $\hat{\mu}$, particularly under the random effects model 
(3), because it gives more relative weight to small studies than the fixed effect model (Henmi and Copas, 
2010; Bowden et al., 2011). Indeed, for the Aspirin data, we see a slight reduction in the log-odds ratio 
estimate under the multiplicative random effects model (Table 2). In the case of the Aspirin data, it is 
reasonable to suspect that the effect of oral Aspirin on pain relief is substantially smaller than suggested 
by either random effects analysis. 


3.3. Small study bias explained to a lay audience via the Meta-Analyzer 

When attempting to explain the concept of biased evidence and of bias adjustment to the science festival 
audience, we opted for a simplified version of Trim and Fill (Duval and Tweedie, 2000). It aims to replace 
‘missing studies’ in a meta-analysis by a process of reflection, until symmetry in the funnel plot is restored. 
We illustrated this idea within the context of a Sherlock Holmes style mystery (see Box 1 and Figure 9 in 
the Appendix, which shows the Meta-Analyzer at its initial conceptual state and in situ at the Cambridge 
Science Festival, http://www.sciencefestival.cam.ac.uk/). 


3.4. Egger regression 

Despite the intuition and appeal of its end result, the mathematics behind Trim and Fill are quite 
complicated. Perhaps due to its relative simplicity, the most popular approach to testing and adjusting 
for small study bias in medical research is Egger regression (Egger et al., 1997). This assumes the following 
linear fixed effect model in order to explain the correlation between $y_i$ and $1/s_i$: 

$$\frac{y_i}{s_i} = \beta_0 + \frac{\mu}{s_i} + \epsilon_i, \qquad \epsilon_i \sim N(0,1). \qquad (8)$$
Model                         Parameter          Est     S.E     t value   p-value 
Random effects model (3)      $\hat{\mu}$        1.26    0.082   15.4      <10e-16 
                              $\hat{\tau}^2$     0.04    -       -         - 
Random effects model (4)      $\hat{\mu}$        1.23    0.080   15.4      <10e-16 
                              $\hat{\phi}$       1.11    -       -         - 
Egger regression model (10)   $\hat{\beta}_0$    2.11    0.31    6.77      5.8e-09 
                              $\hat{\mu}$        0.025   0.19    0.13      0.89 
                              $\hat{\phi}$       0.64    -       -         - 

Table 2: Results for meta-analyses of the Aspirin data. 


Testing for small study bias is then equivalent to testing $H_0: \beta_0 = 0$. If model (8) and $\perp\!\!\!\perp(s_i, \epsilon_i)$ hold, 
then the overall effect estimate, $\hat{\mu}$, adjusted for possible small study bias (via $\hat{\beta}_0$) is a consistent estimate 
for the overall treatment effect, $\mu$. Several authors have considered the addition of random effects into 
model (8), in order to account for possible residual heterogeneity after adjustment for small study bias, see 
for example Moreno et al. (2009), Peters et al. (2010) and Rücker et al. (2011). Their approaches 
have been straightforward generalisations of the additive and multiplicative random effects models (3) 
and (4) respectively, as below: 

$$\frac{y_i}{s_i} = \beta_0 + \frac{\mu}{s_i} + \frac{\delta_i}{s_i} + \epsilon_i, \qquad \delta_i \sim N(0, \tau^2), \quad \epsilon_i \sim N(0,1). \qquad (9)$$

$$\frac{y_i}{s_i} = \beta_0 + \frac{\mu}{s_i} + \phi^{1/2} \epsilon_i, \qquad \epsilon_i \sim N(0,1). \qquad (10)$$


Again, regardless of whether model (9) or (10) holds in practice, under the independence assumption, consistent estimation 
of the overall mean parameter $\mu$ follows from fitting the standard fixed effect Egger model (8). 

At first sight, model (9) seems the most natural extension to model (8). However, when a non-zero 
value for $\tau^2$ is estimated under model (9), the resulting overall estimate for $\mu$ differs from, and can often 
exhibit more substantial bias than, the fixed effect estimate (Rücker et al., 2011). By contrast, 
multiplicative model (10) is far better behaved in this respect since its point estimate is identical to that 
obtained from fitting model (8). Nullifying the influence of variance components on the overall mean, a 
property enjoyed by model (10), is so attractive in the presence of small study bias that approaches have 
been developed to artificially incorporate this feature into additive random effects models as well (Henmi 
and Copas, 2010). 
For the reasons outlined above, we analyse the Aspirin data using the multiplicative Egger regression 
model only. This is straightforward, because model (10) is automatically fitted by the process of standard 
linear regression using the least squares criterion. For these data $\hat{\beta}_0 = 2.11$, with a p-value of 
approximately $1 \times 10^{-8}$, $\hat{\mu}$ is equal to 0.025, with a p-value of 0.89, and $\hat{\phi} = 0.64$. In summary, Egger regression 
detects a highly significant presence of small study bias and, after this has been removed, no evidence of 
a treatment effect whatsoever. A slightly concerning feature is the severe under-dispersion in the residual 
error after adjustment for small study bias, an issue we return to in Section 5. 
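As a minimal sketch of how model (10) can be fitted by ordinary least squares, the snippet below regresses $y_i/s_i$ on $1/s_i$ with an intercept. The intercept estimates $\beta_0$, the slope estimates $\mu$, and the residual mean square on $k-2$ degrees of freedom estimates $\phi$. The function name and the data are hypothetical and do not reproduce the Aspirin analysis.

```python
def egger_regression(y, s):
    """OLS of y_i/s_i on 1/s_i: returns (beta0_hat, mu_hat, phi_hat)."""
    k = len(y)
    z = [yi / si for yi, si in zip(y, s)]  # standardised effect sizes
    x = [1.0 / si for si in s]             # precisions
    x_bar, z_bar = sum(x) / k, sum(z) / k
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    mu_hat = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(x, z)) / sxx
    beta0_hat = z_bar - mu_hat * x_bar
    rss = sum((zi - beta0_hat - mu_hat * xi) ** 2 for xi, zi in zip(x, z))
    phi_hat = rss / (k - 2)                # residual mean square
    return beta0_hat, mu_hat, phi_hat


# Hypothetical data with a small-study effect.
y = [1.9, 1.2, 1.5, 0.7, 0.9, 0.4]
s = [0.6, 0.5, 0.4, 0.3, 0.2, 0.1]
beta0_hat, mu_hat, phi_hat = egger_regression(y, s)
print(beta0_hat, mu_hat, phi_hat)
```

A value of `phi_hat` below 1 corresponds to the under-dispersion discussed above, and a value above 1 to over-dispersion.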


4. Mendelian randomization and meta-analysis 

Mendelian randomization (MR) (Davey Smith and Ibrahim, 2003) is a technique applied to observational 
data that exploits information on genetic variants to infer whether a modifiable risk factor, $X$, has a 
causal effect on a health outcome, $Y$. Straightforward analysis methods with observational data are 
vulnerable to confounding bias, and estimates of association can only have a true causal interpretation 
if all confounders between $X$ and $Y$ (which we denote by $U$) have been adjusted for. An MR study 
can also be used in place of a randomized clinical trial, when such a trial cannot be implemented for 
ethical, practical or financial reasons. The method requires that a genetic variant, $G_i$, exists that satisfies 
the ‘Instrumental Variable’ (IV) assumptions, see Bowden and Turkington (1990) for a thorough review. 
Specifically, $G_i$ must be: (i) associated with $X$; (ii) not associated with $U$; and (iii) not associated with $Y$ given 
$X$ and $U$. These assumptions are encoded by the solid arrows in Figure 5. The causal effect of $X$ on $Y$ is 
the parameter of interest and, following the notation in this paper, is denoted by $\mu$. The Wald estimator 
is the ratio of the gene $i$ outcome association, $\hat{\mu}_{YG_i}$, and the gene $i$ exposure association, $\hat{\mu}_{XG_i}$, giving 
$\hat{\mu}_i = \hat{\mu}_{YG_i} / \hat{\mu}_{XG_i}$. Under the IV assumptions, $\hat{\mu}_{YG_i}$ tends towards the product of the gene-exposure association 
and the causal effect, $\mu \, \mu_{XG_i}$, as the sample size grows large, so that the Wald estimate is asymptotically 
unbiased. 

Fig. 5: Causal diagram representing the standard IV assumptions (solid lines), versus violations of the 
assumptions (dotted lines). 
Estimation of $\mu$ via an MR analysis incorporating multiple uncorrelated genetic variants $G_1, \ldots, G_k$ can be 
viewed as equivalent to a fixed effect meta-analysis of the Wald ratio estimates $\hat{\mu}_1, \ldots, \hat{\mu}_k$ (Johnson, 2014). 
So, formula (2) can be applied in the MR context to yield 

$$\hat{\mu} = \frac{\sum_{i=1}^{k} \hat{\mu}^2_{XG_i} \hat{\mu}_i}{\sum_{i=1}^{k} \hat{\mu}^2_{XG_i}}. \qquad (11)$$

Relating this back to the general meta-analysis context, we can equate the gene-exposure association, 
$\hat{\mu}_{XG_i}$, to a within study precision, $1/s_i$, and the causal effect estimate, $\hat{\mu}_i$, with the effect estimate for 
study $i$, $y_i$. Just as for a fixed effects meta-analysis, formula (11) assumes that $\hat{\mu}_{XG_i}$, although estimated, 
is known. It also assumes for simplicity that all variants have the same allele frequency, but this can easily 
be accounted for if false. Formula (11) has the advantage of being calculable directly from summary data 
estimates, and therefore does not require data at the individual level. 
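The correspondence with a fixed effect meta-analysis can be sketched numerically. Taking the text's identification of the gene-exposure association with the precision $1/s_i$, the weight for variant $i$ is the squared gene-exposure association, and the weighted mean of the Wald ratios equals a ratio of cross-products. The function names and the three variant associations below are hypothetical.

```python
def wald_ratios(assoc_outcome, assoc_exposure):
    """Per-variant Wald ratio: gene-outcome over gene-exposure association."""
    return [gy / gx for gy, gx in zip(assoc_outcome, assoc_exposure)]


def combined_estimate(assoc_outcome, assoc_exposure):
    """Weighted mean of Wald ratios with weights w_i = (gene-exposure association)^2."""
    ratios = wald_ratios(assoc_outcome, assoc_exposure)
    weights = [gx ** 2 for gx in assoc_exposure]  # plays the role of 1/s_i^2
    return sum(w * r for w, r in zip(weights, ratios)) / sum(weights)


# Hypothetical summary associations for three uncorrelated variants.
assoc_exposure = [0.10, 0.25, 0.40]
assoc_outcome = [0.03, 0.05, 0.12]

estimate = combined_estimate(assoc_outcome, assoc_exposure)
cross_product_form = (
    sum(gx * gy for gx, gy in zip(assoc_exposure, assoc_outcome))
    / sum(gx ** 2 for gx in assoc_exposure)
)
print(estimate, cross_product_form)
```

Both forms use only summary associations, which is the practical advantage noted above.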


4.1. Pleiotropy and small study bias 

In a short period, the number of genetic variants with summary data estimates available for a given MR 
study has gone from a handful to, in many cases, several hundred, thus fuelling an exponential growth in 
the technique's application and its statistical power. However, as the number of genetic variants utilized 
in an MR analysis increases, so too does the likelihood that some of the included variants are invalid 
instrumental variables, which could potentially bias the analysis. The dotted lines in Figure 5 highlight 
these possible violations, namely (ii) that a variant may be associated with a confounder of $X$ and $Y$ and 
(iii) that a variant may affect the outcome via a pathway completely independent of $X$. Violation 
of assumptions (ii) and (iii) cannot be directly tested, and together the magnitude of their violation is 
referred to as the pleiotropic effect of a variant. Substituting $1/s_i$ for $\hat{\mu}_{XG_i}$, $y_i$ for $\hat{\mu}_i$ and denoting the 
total pleiotropic effect of variant $i$ by $\beta_{0i}$, we can write the following more plausible model for the $i$th 
Wald ratio estimate in an MR analysis using the notation of this paper: 

$$\frac{y_i}{s_i} = \beta_{0i} + \frac{\mu}{s_i} + \epsilon_i = \beta_0 + \frac{\mu}{s_i} + \psi_i + \epsilon_i, \qquad \epsilon_i \sim N(0,1), \quad \psi_i \sim N(0, \sigma^2_{\psi}). \qquad (12)$$


Model (12), first conceptualized by Bowden, Davey Smith and Burgess (2015) but clarified here, allows 
the causal effect estimate of variant $i$ to be composed of a common causal effect, $\mu$, and an additional, 
individual pleiotropy term, $\beta_{0i} = \beta_0 + \psi_i$. For convenience we assume that $\psi_i$ is normally distributed with 
zero mean and variance $\sigma^2_{\psi}$. The standard method of analysis for estimating the overall causal effect, 
formula (11), assumes all genetic variants are valid IVs ($\beta_{0i} = 0$ for all $i$). Egger regression implemented 
via model (8), which ignores the true heterogeneous nature of the bias by estimating only its mean value, 
$\beta_0$, can still consistently estimate the causal effect, $\mu$, even when all genetic variants exhibit pleiotropy, 
as long as $s_i$ is independent of both $\epsilon_i$ and $\psi_i$ in model (12). Under this assumption: 

$$\hat{\mu} = \frac{\mathrm{Cov}(y/s, 1/s)}{\mathrm{Var}(1/s)} = \mu + \frac{\mathrm{Cov}(\epsilon, 1/s)}{\mathrm{Var}(1/s)} + \frac{\mathrm{Cov}(\psi, 1/s)}{\mathrm{Var}(1/s)} \qquad (13)$$

and as the number of variants $k$ grows large, the covariance terms above tend to 0 and $\hat{\mu} \to \mu$. Bowden, 
Davey Smith and Burgess (2015) show that this independence assumption is only plausible if IV assumptions (i) and (ii) hold. 
If assumption (ii) is violated (as indicated by the dotted line from $G_i$ to the confounder $U$ in Figure 5), $\psi_i$ 
and $s_i = 1/\hat{\mu}_{XG_i}$ will contain common terms via re-introduced confounding and will be positively correlated, 
thus Egger regression will no longer provide a consistent estimate for $\mu$. 


5. A causal interpretation of Egger regression 


The application of Egger regression to a causal inference problem like Mendelian randomization is 
surprising, since this field has traditionally been dominated by instrumental variable methods that are 
conceptually very different. One such technique is the Structural Mean Model framework, which uses 
potential outcomes (Rubin, 2005) and places independence at the heart of its estimation strategy via 
the technique of G-estimation. For a general overview of this method see Vansteelandt and Joffe (2014) 
and, within the context of Mendelian randomization, Bowden and Vansteelandt (2011). By making an 
analogy with Structural Mean Models, we now show that Egger regression can be understood as a means 
to de-bias a meta-analysis by restoring symmetry to the funnel plot, in a different but complementary 
way to Trim and Fill. We start by assuming equation (10) as a ‘working’ mean model, even if in reality 
we believe that the heterogeneous bias model (12) is more realistic. We multiply each side of the working 
model by $s_i$ and subtract $\beta_0 s_i$ to yield 

$$y_i(\beta_0) = y_i - \beta_0 s_i = \mu + \phi^{1/2} \epsilon_i s_i.$$


The term $y_i(\beta_0)$ is a transformed version of the effect size estimates, can be viewed as a potential outcome, 
and is theoretically mean independent of $s_i$ under $\perp\!\!\!\perp(s_i, \epsilon_i)$. The intercept estimate obtained from fitting 
model (10), $\hat{\beta}_0$, can be viewed as defining a particular transform of the data, $y_i(\hat{\beta}_0)$, that forces $y_i(\hat{\beta}_0)$ to be 
independent of $s_i$ across all studies. Taking this further, fitting model (10) is equivalent to solving the following 
system of estimating equations: 
$$\text{Weight equation:} \qquad w_i = 1/(\phi s_i^2)$$

$$\text{Potential outcome transform:} \qquad y_i(\beta_0) = y_i - \beta_0 s_i$$

$$\text{Mean equation:} \qquad \sum_{i=1}^{k} w_i \{y_i(\beta_0) - \mu\} = 0 \qquad (14)$$

$$\text{G-estimation equation:} \qquad \sum_{i=1}^{k} w_i \{y_i(\beta_0) - \mu\} (s_i - \bar{s}) = 0 \qquad (15)$$

$$\text{Heterogeneity equation:} \qquad \sum_{i=1}^{k} w_i \{y_i(\beta_0) - \mu\}^2 - (k - 2) = 0, \qquad (16)$$


where $\bar{s}$ is the arithmetic mean of the $s_i$ terms. We now clarify the connection between the above 
system of estimating equations and estimation of $\beta_0$, $\mu$ and $\phi$ using standard linear regression theory. 
Fitting model (8) to obtain estimates for $\beta_0$ and $\mu$ is equivalent to solving equations (14) and (15) 
leaving $\phi$ unspecified (in the Appendix we provide some simple R code to verify this). We then formally 
define $\phi$ as a parameter and solve equation (16) to give 

$$\hat{\phi} = \frac{\sum_{i=1}^{k} \frac{1}{s_i^2} \{y_i(\hat{\beta}_0) - \hat{\mu}\}^2}{k - 2}. \qquad (17)$$

We note the equivalence of the numerator of equation (17) to the Q statistic defined in Rücker et al. 
(2011). The variances of $\hat{\beta}_0$ and $\hat{\mu}$ are given by 




<(> 




==—, Var(/3o) = Var(/x)s2, 


where is the sample mean of the squared, within study variances. 
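The equivalence between least squares fitting and the estimating equations can be checked numerically. The sketch below is a Python stand-in for the kind of check the text attributes to R code in the Appendix, which is not reproduced here: it fits the regression of $y_i/s_i$ on $1/s_i$, then evaluates the left-hand sides of equations (14), (15) and (16) at the fitted values. All names and data are hypothetical.

```python
def egger_fit(y, s):
    """OLS of y_i/s_i on 1/s_i: returns (beta0_hat, mu_hat, phi_hat)."""
    k = len(y)
    z = [yi / si for yi, si in zip(y, s)]
    x = [1.0 / si for si in s]
    x_bar, z_bar = sum(x) / k, sum(z) / k
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    mu_hat = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(x, z)) / sxx
    beta0_hat = z_bar - mu_hat * x_bar
    rss = sum((zi - beta0_hat - mu_hat * xi) ** 2 for xi, zi in zip(x, z))
    return beta0_hat, mu_hat, rss / (k - 2)


def estimating_equations(y, s, beta0, mu, phi):
    """Left-hand sides of equations (14), (15) and (16)."""
    k = len(y)
    s_bar = sum(s) / k
    w = [1.0 / (phi * si ** 2) for si in s]                 # weight equation
    resid = [yi - beta0 * si - mu for yi, si in zip(y, s)]  # y_i(beta0) - mu
    eq14 = sum(wi * ri for wi, ri in zip(w, resid))
    eq15 = sum(wi * ri * (si - s_bar) for wi, ri, si in zip(w, resid, s))
    eq16 = sum(wi * ri ** 2 for wi, ri in zip(w, resid)) - (k - 2)
    return eq14, eq15, eq16


# Hypothetical data with a small-study effect.
y = [1.9, 1.2, 1.5, 0.7, 0.9, 0.4]
s = [0.6, 0.5, 0.4, 0.3, 0.2, 0.1]
beta0_hat, mu_hat, phi_hat = egger_fit(y, s)
print(estimating_equations(y, s, beta0_hat, mu_hat, phi_hat))  # all close to zero
```

Equation (15) holds for any constant in place of $\bar{s}$ once (14) is satisfied, so the choice of the arithmetic mean serves interpretation rather than being needed for the solution.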


Concerns over the use of Egger regression have been raised when analysing binary data because a study’s
outcome estimate will not truly be independent of its standard error, even when small study bias is not
present. For this reason, Peters et al. (2010) propose replacing $s_i$ in the Egger regression equation with
a measure of precision based on study size, $1/n_i$ say. This could easily be represented in our estimating
equation framework with suitable modification. For example, the G-estimation equation above would then be
used to find the $\beta_0$ that forces independence between $y_i(\beta_0) = y_i - \beta_0/n_i$ and $1/n_i$.
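
A minimal sketch of this modification in R is given below, under stated assumptions: the data are simulated, the study sizes `n` are hypothetical, and the weights are kept at $1/s_i^2$, which is a choice made here for illustration rather than something prescribed in the text. The function name `g_est_size` is likewise illustrative.

```r
# Simulated stand-in data with hypothetical study sizes (assumption)
set.seed(2)
k <- 30
n <- round(runif(k, 50, 2000))       # study sizes
s <- 2 / sqrt(n)                     # standard errors shrink with study size
y <- 0.1 + 15 / n + s * rnorm(k)     # small-study effect acting through 1/n
w <- 1 / s^2                         # inverse-variance weights (assumed choice)
x <- 1 / n                           # size-based covariate replacing s

# G-estimation function: weighted covariance between y - beta0/n and 1/n
g_est_size <- function(beta0) {
  y_b <- y - beta0 * x
  mu  <- sum(w * y_b) / sum(w)
  sum(w * (y_b - mu) * (x - mean(x)))
}

beta0_hat <- uniroot(g_est_size, c(-1e4, 1e4), tol = 1e-10)$root
mu_hat    <- sum(w * (y - beta0_hat * x)) / sum(w)

# The same estimates from a weighted regression of y on 1/n
fit <- lm(y ~ x, weights = w)
print(rbind(g_estimation = c(mu_hat, beta0_hat), weighted_lm = coef(fit)))
```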


5.1. Re-analysis of the Aspirin data 

Returning to the Aspirin data, Figure 6 (left) shows the final resting point of the Meta-Analyzer upon
enacting Egger regression using the causal interpretation described above. Once the Egger regression
option is ticked, users observe a dynamic change in the Meta-Analyzer from its initial starting point,
the standard random effects analysis shown in an earlier figure. The x-axis position is now the transformed
or potential outcome scale $y_i(\hat\beta_0) = y_i - \hat\beta_0 s_i$ for study $i$. The meta-analysis now
exhibits a high degree of symmetry that can be immediately visualised by the user. This transformation is
highlighted in Figure 6 (right), which plots potential outcomes $y_i(\hat\beta_0)$ on the horizontal axis
versus the rescaled precisions $1/(\hat\phi^{1/2} s_i)$ on the vertical axis, with the original data
($y_i$ versus $1/s_i$) also shown for comparison. When small study bias is present, estimates from small
studies are shifted by a relatively large amount in the horizontal dimension (in this case to the left),
whereas those from large studies are shifted horizontally by a relatively small amount.


Because $\phi$ is estimated to be 0.64 in this instance, indicating under-dispersion after adjustment for
small study bias, study precisions under the potential outcome transformation are increased in proportion
to their size, so that the precisions of small studies are shifted vertically upwards by a small amount and
those of large studies are shifted by a large amount. If over-dispersion had been observed after
adjustment, we would have seen a shift vertically downwards instead. If we think that the heterogeneous
bias model (12) is true and $\psi_i \perp\!\!\!\perp s_i$ holds, then we would expect these data to be
over-dispersed since, from equation (17), $E[\hat\phi] = 1 + \sigma^2_\psi$. Therefore, we can only
meaningfully interpret $\hat\phi - 1$ as an estimate for $\sigma^2_\psi$ if $\hat\phi > 1$.
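
The link between over-dispersion and the bias variance can be illustrated by simulation. The sketch below is written in R and rests on an assumption about the form of the heterogeneous bias model, namely $y_i = \mu + (\beta_0 + \psi_i) s_i + s_i \epsilon_i$ with $\psi_i \sim N(0, \sigma^2_\psi)$ drawn independently of $s_i$; the parameter values and variable names are hypothetical.

```r
# Simulate from an assumed form of the heterogeneous bias model, with psi
# drawn independently of s (all values below are hypothetical)
set.seed(3)
k         <- 5000
sigma2psi <- 1.0
s   <- runif(k, 0.05, 0.5)                    # within-study standard errors
psi <- rnorm(k, 0, sqrt(sigma2psi))           # study-specific bias terms
y   <- 0.3 + (0.5 + psi) * s + s * rnorm(k)

# Egger regression: the residual variance of the weighted fit is phi-hat
fit     <- lm(y ~ s, weights = 1 / s^2)
phi_hat <- summary(fit)$sigma^2

# phi-hat - 1 is only interpretable as a variance estimate when positive
sigma2psi_hat <- max(phi_hat - 1, 0)
print(c(phi_hat = phi_hat, sigma2psi_hat = sigma2psi_hat))
```

With a large number of simulated studies the dispersion estimate settles close to $1 + \sigma^2_\psi$, whereas an estimate below 1, as in the Aspirin data, cannot be produced by this mechanism alone.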



Fig. 6: Left: The Meta-Analyzer incorporating Egger regression enacted on the Aspirin data and shown
under the potential outcome transform. Right: Funnel plot of the original Aspirin data ($y_i$ vs $1/s_i$,
hollow red dots) versus its transformed counterpart ($y_i(\hat\beta_0)$ vs $1/(\hat\phi^{1/2} s_i)$, solid
black dots). Horizontal axis: Estimate.


5.1.1. Can Egger regression correct for publication bias?


The heterogeneous bias model (12) seems well suited to the MR context. Furthermore, the work of Hedges
and Copas does indeed imply that heterogeneous bias contaminates the outcome data, $y_i$, under publication
bias. This is because the selective pressure study $i$ experiences (and therefore the magnitude of bias in
its estimate conditional on selection) is a function of its own, unique characteristics, such as $s_i$.
So, whilst model (12) does therefore appear to be a realistic data generating model for mainstream
meta-analyses affected by small study bias, if model (12) is true then $\psi_i \perp\!\!\!\perp s_i$ will
be implausible because $\psi_i$ will clearly then be correlated with $s_i$. Ironically, Egger regression is
therefore unlikely to adequately adjust for small study bias when induced by outcome dependent sampling,
such as in the clinical trial context (for which it was originally intended). Since under-dispersion is a
phenomenon that can also easily be induced by publication bias (Copas and Shi, 2000), this provides even
stronger evidence that the Egger regression analysis of the Aspirin data should be interpreted with
extreme caution.


5.2. The causal effect of LDL cholesterol on heart disease risk 

There is a long and extensive literature on the association between various lipid fractions and
cardiovascular disease, but still far from universal agreement as to whether these associations have a
causal basis. Using summary data estimates available on 154 variants from the CARDIOGRAM consortium
(CARDIOGRAM, 2013), we perform a Mendelian randomization analysis to look for evidence that low density
lipoprotein cholesterol (LDL-c) has a causal effect on coronary artery disease (CAD).

Since LDL-c levels are closely related to and highly correlated with other lipid fractions, such as
triglycerides and high density lipoprotein, we selected only a subset of 57 variants out of the 154 that
were most strongly associated with LDL-c. The minimum p-value for the strength of association across all
variants was 8.3e-7. This strategy should reduce the possibility of violating causal assumption (ii), but
does not rule it out completely. Furthermore, the selected variants might well exhibit pleiotropic effects
through completely separate pathways, and therefore be in violation of assumption (iii). For this reason,
we supplement the standard IVW analysis with Egger regression under model (12).

Figure 7 (left) and Table 3 show the result of a standard additive random effects meta-analysis applied
to the causal effect estimates across all 57 included variants. They quantify the causal effect in terms
of a log-odds ratio of coronary artery disease for a 1 standard deviation increase in LDL-c levels.
Significant heterogeneity is detected in the data ($I^2 = 70\%$, $\hat\tau^2 = 0.11$). Despite this, a
strong positive causal log-odds ratio of 0.37 is estimated. However, there is reason to believe this
analysis to be misleading, given that there exists a noticeable correlation between the magnitude of the
causal effect estimate and its precision, indicative of pleiotropy. Removal of the two relatively imprecise
causal effect estimates which are less than -1.5 reduces the heterogeneity considerably (adjusted
$\hat\tau^2 = 0.07$, results not shown).



Fig. 7: The Meta-Analyzer supporting an MR analysis of the LDL variants under an additive random 
effects model (left) and Egger regression (right). 
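
The heterogeneity summaries quoted for the additive random effects analysis can be computed along the following lines. This is a sketch in R on simulated stand-in data, not the CARDIOGRAM estimates, and it assumes the DerSimonian-Laird moment estimator for $\tau^2$; the text does not state which estimator was used, so that choice is an assumption.

```r
# Simulated stand-in for 57 variant-specific causal effect estimates
set.seed(4)
k     <- 57
s     <- runif(k, 0.1, 0.8)                 # standard errors of the estimates
theta <- rnorm(k, 0.4, sqrt(0.1))           # variant-specific effects, tau^2 = 0.1
y     <- theta + s * rnorm(k)

# Fixed effect (IVW) estimate and Cochran's Q
w     <- 1 / s^2
mu_fe <- sum(w * y) / sum(w)
Q     <- sum(w * (y - mu_fe)^2)

# I^2 and the DerSimonian-Laird estimate of tau^2 (assumed estimator)
I2   <- max(0, (Q - (k - 1)) / Q)
tau2 <- max(0, (Q - (k - 1)) / (sum(w) - sum(w^2) / sum(w)))

# Additive random effects estimate and its standard error
w_re  <- 1 / (s^2 + tau2)
mu_re <- sum(w_re * y) / sum(w_re)
se_re <- sqrt(1 / sum(w_re))
print(c(mu_re = mu_re, se_re = se_re, I2 = I2, tau2 = tau2))
```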


Model                          Parameter          Est        S.E.      t value    p-value

Random effects model (3)       $\mu$              0.37       0.059     6.29       5.10e-08
                               $\tau^2$           0.11       -         -          -

Random effects model (4)       $\mu$              0.45       0.053     8.51       1.13e-11
                               $\phi$             3.33       -         -          -

Egger regression model (12)    $\beta_0$          -0.0102    0.0046    -2.23      0.0298
                               $\mu$              0.632      0.0975    6.481      2.66e-08
                               $\sigma^2_\psi$    2.11       -         -          -

Table 3: MR-Analyses of the lipids data. Estimates for $\mu$ are log-odds ratios per standard deviation
increase in LDL-c.


Applying the multiplicative random effects model (4) to these data, to remove the influence of the
random effects variance $\tau^2$ on the overall mean estimate, yields a slightly higher causal effect
estimate of 0.45. Applying Egger regression model (12) (Figure 7 (right) and Table 3), a significant
negative effect of pleiotropy is detected, despite the pleiotropy variance being large
($\hat\sigma^2_\psi = 2.11$). Consequently, the point estimate for $\mu$ is adjusted upward to 0.63. In
conclusion, although significant evidence of pleiotropy exists across the included variants, there is
still overwhelming evidence that LDL-c is causally related to CAD risk. If anything, the Egger analysis
suggests that the true causal effect of LDL-c is slightly masked by pleiotropy acting in the opposing
direction.


6. Discussion


In this paper we have shown that, by augmenting the funnel plot portraying a meta-analysis of study 
results with an additional pole, cord and pivot, we can give this abstract object - and the overall estimate 
that it implies - a clear physical interpretation. From its original conception and success as a science 
festival exhibit, we think the Meta-Analyzer has the continued potential to be a useful tool for educating 
and explaining the concept of weighing evidence and statistical reasoning to an even wider lay-audience, 
and hopefully to inspire the next generation of data scientists. 

We have shown that the conceptual framework the Meta-Analyzer promotes is useful both for explaining
the rationale for, and interpreting the effect of, extended modelling choices in meta-analysis to a more
technical, advanced audience. Finally, it has allowed us to make connections between methods to adjust for
small study bias in meta-analysis, and for confounding bias in observational epidemiology.

It is worth commenting on the differing ways in which heterogeneity is modelled in standard meta-analysis
and causal inference. In the former, significant differences in effect estimates across studies (as
measured by Cochran’s Q statistic, say) are seen to provide clear evidence of underlying treatment effect
heterogeneity and to justify the subsequent adoption of a random effects model. However, in causal
inference it is strongly assumed that all instrumental variables identify a common causal effect. Any
evidence of heterogeneity between causal effect estimates across different genetic instruments in an MR
analysis is seen as evidence of pleiotropy (or invalid instruments more generally), and formal testing
procedures such as the Sargan test (Sargan, 1951) exist to identify the invalid variants responsible so
that they can be removed from the analysis. In our investigation of the magnesium data we did indeed
remove an outlying study in order to demonstrate a dramatic reduction in the between trial heterogeneity.
This is not encouraged within the general context of meta-analysis but perhaps, as in the causal inference
field, it should be tolerated to a higher degree.

As the subject of Mendelian randomization moves forward, facilitated by the emergence of increasingly
rich summary data sources, it is vital that the field makes use of existing methodology developed for
meta-analysis, particularly in the area of bias modelling. However, by viewing the assumptions of
established methods like Egger regression from a causal perspective, it is also possible that new insights
can be gleaned and related back to the field of meta-analysis more generally. We hope that our paper can
act as a catalyst to help promote and further this aim.

The Meta-Analyzer from theory to practice


Box 1: Sherlock Holmes and the case of the missing evidence

‘It is 1858. A new machine has been invented to exploit the ethereal force of gravity to
weigh evidence: The Meta-Analyzer. Its inventor, Stata Lovelace, says “The Meta-Analyzer
can make decisions affecting Britain and her empire in the name of gold-standard fairness
and morality”. The night before its maiden calculation (into the effect of sanitation on
patient mortality: is it beneficial?), Professor Moriarty has stolen some evidence! Call
Scotland Yard! And Sherlock Holmes!’

‘Inspector Lestrade inspects the lop-sided Meta-Analyzer (Figure 8 (left)) and mops his
brow.’

Inspector Lestrade: ‘It’s simple ’olmes, just move the pivot point’ (Figure 8 (middle)).
‘Case closed.’

Sherlock Holmes: ‘It’s a solution, but an inelegant one, Lestrade. Replace the missing
studies to return the balance and yield the unbiased truth’ (Figure 8 (right)).



Fig. 8: Left: Unbalanced Meta-Analyzer. Middle: Rebalanced Meta-Analyzer (Inspector Lestrade
approach). Right: Rebalanced Meta-Analyzer (Sherlock Holmes approach)

R code for Section 5

## Given Aspirin data vectors y (effect estimates) and s (standard errors)

## G-estimation routine: squared left-hand side of equation (15) as a function of beta0
G_est = function(a){
  w = 1/s^2                          # inverse-variance weights
  Beta0 = a[1]
  yBeta0 = y - Beta0*s               # potential outcome transform
  MU = sum(w*yBeta0)/sum(w)          # solves the mean equation (14)
  (sum(w*(yBeta0-MU)*(s - mean(s))))^2
}

## Minimise the squared estimating function over beta0
Beta0hat = optimize(G_est,c(-5,5))$minimum
yBeta0hat = y - Beta0hat*s
w = 1/s^2
mu = sum(w*yBeta0hat)/sum(w)

> Beta0hat
[1] 2.112803
> mu
[1] 0.02519816

> ## standard Egger regression
> summary(lm(y~s, weights=1/s^2))$coef[,1]
(Intercept)           s
 0.02519816  2.11280317

Fig. 9: The Meta-Analyzer in action at the Cambridge Science Festival
















